Update environment pins; fix demo notebook; and sync to Hugging Face by sou-cheng-choi · Pull Request #6 · QMCSoftware/LDData

sou-cheng-choi · 2025-11-20T13:39:37Z

Corrects a broken pip install in env.yml by updating the qmctoolscl pip pin to version 1.1.5 and the qmcpy pin to 2.0.
Fixes the broken demo notebook.

alegresor

Please do not put line breaks in the README. There should be an option in your editor to toggle word wrap in order to view long lines. Line breaks are hard to maintain when text is edited.

Copilot

Pull Request Overview

This PR updates environment dependencies and adds infrastructure for uploading the LDData repository to Hugging Face Datasets Hub. The main changes support transitioning from a standalone GitHub repository to a publicly accessible dataset on Hugging Face, making low-discrepancy point set parameters more discoverable and easier to use in QMC research.

Key changes:

Updates qmcpy to version 2.0 and qmctoolscl to version 1.1.5 to fix broken pip installations
Adds comprehensive upload tooling (upload.py, git_lfs_upload.sh) and GitHub Actions workflow for automated synchronization to Hugging Face
Reorganizes documentation: transforms README.md into a Hugging Face dataset card and moves technical specifications to LD_DATA.md
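The version pins listed above can be checked against an installed environment with a short script. This is a minimal sketch, not part of the PR: the `PINS` mapping and the `check_pins` helper are hypothetical names, and it assumes the installed version strings either equal the pin or extend it with a patch component (for example `2.0.1` for a `2.0` pin).

```python
from importlib.metadata import PackageNotFoundError, version

# Pins named in this PR's description (assumed to mirror env.yml).
PINS = {"qmcpy": "2.0", "qmctoolscl": "1.1.5"}


def check_pins(pins=PINS, get_version=version):
    """Return {package: (expected, installed_or_None)} for every mismatch."""
    mismatches = {}
    for name, expected in pins.items():
        try:
            installed = get_version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed at all
        # Accept an exact match or a longer version with the same prefix.
        matches = installed is not None and (
            installed == expected or installed.startswith(expected + ".")
        )
        if not matches:
            mismatches[name] = (expected, installed)
    return mismatches


if __name__ == "__main__":
    for name, (expected, installed) in check_pins().items():
        print(f"{name}: expected {expected}, found {installed}")
```

Running the script inside the environment prints one line per package that is missing or does not match its pin, and prints nothing when the environment agrees with the pins.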

Reviewed Changes

Copilot reviewed 8 out of 9 changed files in this pull request and generated 10 comments.

File	Description
`env.yml`	Updates qmcpy and qmctoolscl versions, adds huggingface_hub dependency for upload functionality
`upload.py`	New Python script for uploading repository to Hugging Face Datasets Hub with retry logic and fallback mechanisms
`scripts/git_lfs_upload.sh`	New bash script for git-based uploads using git-lfs for large files
`README.md`	Transformed into a Hugging Face dataset card with usage examples, citations, and dataset structure documentation
`LD_DATA.md`	New file containing the original technical specification for low-discrepancy data formats (moved from old README)
`LICENSE.txt`	Adds Apache 2.0 license file
`.gitignore`	Adds patterns for Python cache files, VS Code settings, and script directories
`.github/workflows/sync-to-huggingface.yml`	New GitHub Actions workflow for automated synchronization to Hugging Face on push

sou-cheng-choi · 2025-11-20T20:43:03Z

@alegresor @copilot @zitterbewegung

The CI tests pass now.

Copilot · 2025-11-20T20:43:12Z

@sou-cheng-choi I've opened a new pull request, #8, to work on those changes. Once the pull request is ready, I'll request review from you.

zitterbewegung · 2025-11-20T21:23:24Z

We should not create a wrapper on top of the huggingface_hub if we are only going to update or create a dataset. See https://huggingface.co/docs/datasets/en/upload_dataset

zitterbewegung · 2025-11-20T21:24:45Z

Dataset location is at https://huggingface.co/datasets/QMCSoftware/LDData/tree/main
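The suggestion above, to call huggingface_hub directly rather than wrap it, could look roughly like the following. This is a sketch under stated assumptions and not the PR's upload.py: the helper names `upload_kwargs` and `sync`, the commit message, the ignore pattern, and the use of an `HF_TOKEN` environment variable are all illustrative choices. The only library calls used are `HfApi.create_repo` and `HfApi.upload_folder`.

```python
import os
from pathlib import Path

# Dataset location given in this thread.
REPO_ID = "QMCSoftware/LDData"


def upload_kwargs(folder, repo_id=REPO_ID, message="Sync from GitHub"):
    """Build the keyword arguments passed to HfApi.upload_folder."""
    return {
        "folder_path": str(Path(folder)),
        "repo_id": repo_id,
        "repo_type": "dataset",         # upload to the Datasets Hub, not a model repo
        "commit_message": message,
        "ignore_patterns": ["*.pyc"],   # assumption: skip Python bytecode files
    }


def sync(folder=".", token=None):
    """Create the dataset repo if needed, then upload the folder in one commit."""
    # Imported lazily so the module can be inspected without the dependency.
    from huggingface_hub import HfApi

    api = HfApi(token=token or os.environ.get("HF_TOKEN"))
    api.create_repo(REPO_ID, repo_type="dataset", exist_ok=True)
    return api.upload_folder(**upload_kwargs(folder))
```

Calling `sync(".")` from the repository root with a write token available would push the working tree to the dataset repo. Retry logic and git-lfs handling are deliberately left out of this sketch.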

alegresor

Thank you for working on this @sou-cheng-choi, it looks like you have done a lot!

May I suggest you break this into much smaller PRs that would be easier to review and faster to get merged? I would suggest the following:

1. A PR which adds the LICENSE
2. A PR which removes env.yml in favor of pyproject.toml
3. A PR which adds your enhancements to the README.md and adds the LD_DATA.md
4. A PR which adds the Hugging Face action

For adding the HF action, I must admit I do not understand what many of your files are doing. The repo ruslanmv/How-to-Sync-Hugging-Face-Spaces-with-a-GitHub-Repository gives an MWE of how to sync data from a repo into HF. Based on that MWE, I would expect suggested PR 4 to add only a single file to .github/workflows/ that automatically uploads the dataset to HF whenever something is pushed to a branch.

sou-cheng-choi · 2025-11-21T18:02:31Z

I will close this PR and break it into multiple PRs following your suggestions.

sou-cheng-choi · 2025-11-22T23:06:14Z

Reopening so that I don't forget to open the sub-PRs.

Fix Demo with QMCPy 2.0

7b2d0ae

sou-cheng-choi requested a review from alegresor November 20, 2025 13:39

sou-cheng-choi changed the title ~~Update environment pin~~ Update environment pins and fix demo notebook Nov 20, 2025

alegresor approved these changes Nov 20, 2025

+line breaks

8adfd78

alegresor requested changes Nov 20, 2025

sou-cheng-choi added 6 commits November 20, 2025 08:07

Create LICENSE.txt

fbb0ded

Rename README.md to LD_DATA.md

5ab3909

-line breaks

5588961

First version of upload.py and new README

9d25c4b

+reset option

d6b1130

Add workflow

7f79bc9

sou-cheng-choi requested a review from Copilot November 20, 2025 15:43

Copilot started reviewing on behalf of sou-cheng-choi November 20, 2025 15:44

Copilot finished reviewing on behalf of sou-cheng-choi November 20, 2025 15:47

Copilot AI reviewed Nov 20, 2025

sou-cheng-choi and others added 7 commits November 20, 2025 10:49

Update LD_DATA.md

72386ac

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Update .github/workflows/sync-to-huggingface.yml

d1014fd

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Add unit tests and CI tests

38b111d

Update git_lfs_upload.sh

030e8d7

Update ci.yml

18c2ca2

Update ci.yml

31865bf

Fix CI Test failure

2fecace

Copilot AI mentioned this pull request Nov 20, 2025

Update environment pins and fix demo notebook #8

Closed

sou-cheng-choi requested a review from alegresor November 20, 2025 20:47

sou-cheng-choi changed the title ~~Update environment pins and fix demo notebook~~ Update environment pins; fix demo notebook; and sync to Hugging Face Nov 20, 2025

alegresor requested changes Nov 21, 2025

sou-cheng-choi closed this Nov 21, 2025

sou-cheng-choi reopened this Nov 22, 2025

4 participants